A topological chaos framework for hash functions 



Abstract 

This paper presents a new procedure for generating hash functions that can be evaluated using 
mathematical tools. The procedure is based on discrete chaotic iterations. 

First, it is proven mathematically that these discrete chaotic iterations can be considered as a 
particular case of topological chaos. Then, the process of generating a hash function based on 
topological chaos is detailed. Finally, it is shown how tools from the domain of 
topological chaos can be used to measure, quantitatively and qualitatively, some desirable properties of 
hash functions. An illustrative example is detailed to show how one can create hash functions 
using our theoretical study. 

I. Introduction 

Hash functions, such as MD5 or SHA-256, can be described by discrete iterations on a finite set. In this 
paper, the elements of this finite set are called cells. These cells represent the blocks of the text to which the 
hash function will be applied. This study originated from the idea of using discrete chaotic iterations 
to generate new hash functions. This idea quickly raised the question of whether discrete chaotic 
iterations really generate chaos. 

This article presents the research results related to this question. 
First, we prove that, under some conditions, discrete chaotic iterations produce chaos; more precisely, they produce 
topological chaos in the sense of Devaney. Topological chaos is a rigorous and well-studied framework in the 
mathematical theory of chaos. Thanks to this result, we give a process for generating hash functions. 

Beyond the theoretical interest of connecting the field of discrete chaotic iterations with that of 
topological chaos, our study gives a framework that makes it possible to create hash functions that can be 
mathematically evaluated and compared. 

Indeed, some qualities required of hash functions, such as strong sensitivity to the original text, resistance 
to collisions and unpredictability, can be described mathematically by notions from the theory of topological 
chaos, namely sensitivity, transitivity, entropy and expansivity. These concepts are introduced but not 
studied in depth in this article. More detailed studies will be carried out in forthcoming articles. 

This study is the first of a series we intend to carry out. We think that the mathematical framework in 
which we work offers interesting new tools for the design, comparison and evaluation 
of new methods of encryption in general, not only hash functions. 

The rest of the paper is organized as follows. 
The next section recalls basic notions from two distinct domains: topological chaos 
and discrete chaotic iterations. 

The third and fourth sections constitute the theoretical study of the present paper. Section III defines the 
topological framework in which we work, while Section IV shows that chaotic iterations produce 
topological chaos. 

Section V details, using an illustrative example, the procedure to build hash functions based on 
our theoretical results. Section VI explains how quantitative measures can be obtained for hash functions. 
The paper ends with a discussion and future work. 

II. Basic recalls 

This section is devoted to basic definitions and terminologies in the field of topological chaos and in the one 
of chaotic iterations. 

1 Devaney's chaotic dynamical systems 

Consider a metric space $(X, d)$ and a continuous function $f : X \to X$. 

Definition 1 $f$ is said to be topologically transitive if, for any pair of open sets $U, V \subset X$, there exists 
$k > 0$ such that $f^k(U) \cap V \neq \emptyset$. 

Definition 2 An element (a point) $x$ is a periodic element (point) for $f$ of period $n \in \mathbb{N}$ if $f^n(x) = x$. The 
set of periodic points of $f$ is denoted $Per(f)$. 

Definition 3 $(X, f)$ is said to be regular if the set of periodic points is dense in $X$: 

$$\forall x \in X, \forall \varepsilon > 0, \exists p \in Per(f), \quad d(x, p) < \varepsilon.$$ 

Definition 4 $f$ has sensitive dependence on initial conditions if there exists $\delta > 0$ such that, for any $x \in X$ 
and any neighborhood $V$ of $x$, there exist $y \in V$ and $n \geq 0$ such that $|f^n(x) - f^n(y)| > \delta$. 
$\delta$ is called the constant of sensitivity of $f$. 

Let us now recall the definition of a chaotic topological system, in the sense of Devaney [?]: 
Definition 5 $f : X \to X$ is said to be chaotic on $X$ if 

1. $f$ has sensitive dependence on initial conditions, 

2. $f$ is topologically transitive, 

3. $(X, f)$ is regular. 

Therefore, quoting Robert Devaney: "A chaotic map possesses three ingredients: unpredictability, 
indecomposability, and an element of regularity. A chaotic system is unpredictable because of the sensitive 
dependence on initial conditions. It cannot be broken down or decomposed into two subsystems, because 
of topological transitivity. And, in the midst of this random behavior, we nevertheless have an element of 
regularity, namely the periodic points which are dense." 

Banks et al. proved in [2] that sensitive dependence is a consequence of being regular and topologically 
transitive. 

2 Chaotic iterations 

In the sequel, $s[n]$ denotes the $n$-th term of a sequence $s$, $V_i$ denotes the $i$-th component of a vector $V$, 
and $f^k$ denotes the $k$-th composition of a function $f$. Finally, the following notation is used: 
$\llbracket 1; N \rrbracket = \{1, 2, \ldots, N\}$. 

Let us consider a system of a finite number $N$ of cells, each of which has a boolean state. A 
sequence of length $N$ of boolean states of the cells then corresponds to a particular state of the system. 
A strategy is a sequence of elements of $\llbracket 1; N \rrbracket$. The set of all strategies is denoted by $\mathbb{S}$. 

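Definition 6 Let $S \in \mathbb{S}$. The shift function is defined by 

$$\sigma : \mathbb{S} \to \mathbb{S}, \quad (S[n])_{n \in \mathbb{N}} \mapsto (S[n+1])_{n \in \mathbb{N}},$$ 

and the initial function is the map which associates to a sequence its first term: 

$$i : \mathbb{S} \to \llbracket 1; N \rrbracket, \quad (S[n])_{n \in \mathbb{N}} \mapsto S[0].$$ 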

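$\mathbb{B}$ denoting $\{0, 1\}$, let $f : \mathbb{B}^N \to \mathbb{B}^N$ be a function and $S \in \mathbb{S}$ be a strategy. Let us consider the following so-called 
chaotic iterations (see [?] for the general definition of such iterations): 

$$x[0] \in \mathbb{B}^N, \qquad x[n+1]_i = \begin{cases} x[n]_i & \text{if } i \neq S[n], \\ f(x[n])_{S[n]} & \text{if } i = S[n]. \end{cases} \qquad (1)$$ 

In other words, at the $n$-th iteration, only the $S[n]$-th cell is "iterated". Note that in a more general 
formulation, $f(x[n])_{S[n]}$ can be replaced by $f(x[k])_{S[n]}$, where $k \leq n$, modeling for example delayed 
transmission (see e.g. [?]). 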
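
As a minimal sketch of the chaotic iterations described above, the following Python function updates, at each step, only the cell named by the strategy. The names `chaotic_iterations` and `negation`, the 0-based cell indices and the use of a finite strategy prefix are assumptions made for illustration; they are not part of the original formulation.

```python
from typing import Callable, List, Sequence

State = List[int]  # a boolean state: one 0/1 entry per cell


def negation(x: Sequence[int]) -> State:
    """Vectorial boolean negation, used here only as an example of f."""
    return [1 - b for b in x]


def chaotic_iterations(f: Callable[[Sequence[int]], State],
                       x0: Sequence[int],
                       strategy: Sequence[int]) -> State:
    """Apply the chaotic iterations along a finite strategy prefix.

    Assumption: cells are indexed from 0 here, whereas the text uses 1..N.
    """
    x = list(x0)
    for cell in strategy:
        # Only the cell selected by the strategy is updated at this step.
        x[cell] = f(x)[cell]
    return x


print(chaotic_iterations(negation, [0, 0, 0, 0], [0, 2, 2, 3]))  # [1, 0, 0, 1]
```

Cell 2 is selected twice and therefore returns to its initial value, which illustrates that a cell may be iterated several times.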

III. A topological approach of chaotic iterations 

1 The new topological space 

In this section we will put our study in a topological context by defining a suitable set and a suitable distance. 

1.1 Defining the iteration function and the phase space 

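Let us denote by $\delta$ the discrete boolean metric, $\delta(x, y) = 0 \Leftrightarrow x = y$, and define the function 

$$F_f : \llbracket 1; N \rrbracket \times \mathbb{B}^N \to \mathbb{B}^N, \quad (k, E) \mapsto \left( E_j \cdot \delta(k, j) + f(E)_k \cdot \overline{\delta(k, j)} \right)_{j \in \llbracket 1; N \rrbracket},$$ 

where $+$ and $\cdot$ are boolean operations. 

Consider the phase space 

$$X = \llbracket 1; N \rrbracket^{\mathbb{N}} \times \mathbb{B}^N$$ 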
and the map 

$$G_f(S, E) = (\sigma(S), F_f(i(S), E)). \qquad (2)$$ 

Then one can remark that the chaotic iterations defined in (1) can be described by the following iterations: 

$$X[0] \in X, \qquad X[k+1] = G_f(X[k]).$$ 
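
The maps $F_f$ and $G_f$ can be sketched in Python as follows. This is only an illustration under stated assumptions: the strategy is represented by a finite prefix (a true strategy is an infinite sequence), boolean "+" and "." are taken to be OR and AND, and the function names `delta`, `F`, `G` and `negation` are ours.

```python
from typing import Callable, List, Sequence, Tuple

BoolVec = List[int]


def delta(a: int, b: int) -> int:
    """Discrete boolean metric: 0 when a == b, 1 otherwise."""
    return 0 if a == b else 1


def negation(E: Sequence[int]) -> BoolVec:
    """Example choice of f (vectorial negation); any map on B^N could be used."""
    return [1 - b for b in E]


def F(f: Callable[[Sequence[int]], BoolVec], k: int, E: Sequence[int]) -> BoolVec:
    """F_f(k, E): keep E_j for j != k and take f(E)_k for j == k (cells 1..N)."""
    fE = f(E)
    n = len(E)
    # Boolean "+" is OR and "." is AND, following the formula for F_f.
    return [(E[j - 1] & delta(k, j)) | (fE[k - 1] & (1 - delta(k, j)))
            for j in range(1, n + 1)]


def G(f: Callable[[Sequence[int]], BoolVec],
      S: Sequence[int], E: Sequence[int]) -> Tuple[List[int], BoolVec]:
    """One step of G_f: shift the strategy and iterate the cell i(S) = S[0]."""
    return list(S[1:]), F(f, S[0], E)


point = ([1, 3, 3, 4], [0, 0, 0, 0])
while point[0]:                 # iterate until the finite prefix is exhausted
    point = G(negation, *point)
print(point)                    # ([], [1, 0, 0, 1])
```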

The following result can easily be proven by comparing $\mathbb{S}$ and $\mathbb{R}$: 
Theorem 1 The phase space $X$ has the cardinality of the continuum. 

Note that this result is independent of the number of cells. 

1.2 A new distance 

We define a new distance between two points $(S, E), (\check{S}, \check{E}) \in X$ by 

$$d((S, E); (\check{S}, \check{E})) = d_e(E, \check{E}) + d_s(S, \check{S}),$$ 

where 

$$d_e(E, \check{E}) = \sum_{k=1}^{N} \delta(E_k, \check{E}_k).$$ 

Note that if $\lfloor d(X, Y) \rfloor = n$, then the states of the points $X$ and $Y$ differ in $n$ 
cells, and that $d(X, Y) - \lfloor d(X, Y) \rfloor$ measures how much the strategies $S$ and $\check{S}$ diverge. More precisely, 

• This fractional part is less than $10^{-k}$ if and only if the first $k$ terms of the two strategies are equal. 

• If the $k$-th digit is nonzero, then the $k$-th terms of the two strategies are different. 
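
The following sketch illustrates how such a distance separates states (integer part) from strategies (fractional part). The formula for the strategy distance is not recoverable from the text as it stands, so the form used in `d_s` below, $\frac{9}{N} \sum_k |S[k] - \check{S}[k]| / 10^{k+1}$, is an assumption chosen to be consistent with the two properties listed above; the function names and the truncation to finite prefixes are also ours.

```python
from fractions import Fraction
from math import floor
from typing import Sequence, Tuple

Point = Tuple[Sequence[int], Sequence[int]]  # (strategy prefix, state)


def d_e(E1: Sequence[int], E2: Sequence[int]) -> int:
    """Number of cells whose states differ."""
    return sum(1 for a, b in zip(E1, E2) if a != b)


def d_s(S1: Sequence[int], S2: Sequence[int], n_cells: int) -> Fraction:
    """ASSUMED strategy distance (not given in the text), on finite prefixes:
    (9 / N) * sum_k |S1[k] - S2[k]| / 10**(k + 1), which stays below 1."""
    terms = (Fraction(abs(a - b), 10 ** (k + 1))
             for k, (a, b) in enumerate(zip(S1, S2)))
    return Fraction(9, n_cells) * sum(terms, Fraction(0))


def d(p1: Point, p2: Point, n_cells: int) -> Fraction:
    """Sum of the state distance and the (assumed) strategy distance."""
    (S1, E1), (S2, E2) = p1, p2
    return d_e(E1, E2) + d_s(S1, S2, n_cells)


p = ([1, 2, 3], [0, 1, 0, 1])
q = ([1, 2, 4], [1, 1, 0, 0])
dist = d(p, q, 4)
print(floor(dist))          # 2: the states differ in two cells
print(dist - floor(dist))   # 9/4000: below 10**-2, the first two terms agree
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with powers of ten is not blurred by floating-point rounding.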

2 Continuity of the iteration function 

To prove that chaotic iterations are an example of topological chaos in the sense of Devaney [?], $G_f$ must 
be continuous on the metric space $(X, d)$. 

Theorem 2 $G_f$ is a continuous function. 

PROOF We use sequential continuity. 

Let $(S[n], E[n])_{n \in \mathbb{N}}$ be a sequence of the phase space $X$ which converges to $(S, E)$. We will prove 
that $(G_f(S[n], E[n]))_{n \in \mathbb{N}}$ converges to $G_f(S, E)$. Recall that for all $n$, $S[n]$ is a strategy; thus, we 
consider a sequence of strategies (i.e. a sequence of sequences). 

As $d((S[n], E[n]); (S, E))$ converges to 0, each of the distances $d_e(E[n], E)$ and $d_s(S[n], S)$ converges to 0. 
But $d_e(E[n], E)$ is an integer, so $\exists n_0 \in \mathbb{N}$, $d_e(E[n], E) = 0$ for any $n \geq n_0$. 

In other words, there exists a threshold $n_0 \in \mathbb{N}$ after which no cell changes its state: 

$$\exists n_0 \in \mathbb{N}, \quad n \geq n_0 \Rightarrow E[n] = E.$$ 

In addition, $d_s(S[n], S) \to 0$, so $\exists n_1 \in \mathbb{N}$, $d_s(S[n], S) < 10^{-1}$ for all indices greater than or equal to $n_1$. 
This means that for $n \geq n_1$, all the $S[n]$ have the same first term, which is $S[0]$: 

$$\forall n \geq n_1, \quad S[n][0] = S[0].$$ 

Thus, after the $\max(n_0, n_1)$-th term, the states $E[n]$ and $E$ are the same, and the strategies $S[n]$ and $S$ start 
with the same first term. 

Consequently, the states of $G_f(S[n], E[n])$ and $G_f(S, E)$ are equal, so the distance $d$ between these two points is 
strictly less than 1 (after the rank $\max(n_0, n_1)$). 

We now prove that the distance between $G_f(S[n], E[n])$ and $G_f(S, E)$ converges to 0. Let $\varepsilon > 0$. 

• If $\varepsilon \geq 1$, then we have seen that the distance between $G_f(S[n], E[n])$ and $G_f(S, E)$ is strictly 
less than 1 after the $\max(n_0, n_1)$-th term (same state). 

• If $\varepsilon < 1$, then $\exists k \in \mathbb{N}$, $10^{-k} \geq \varepsilon > 10^{-(k+1)}$. But $d_s(S[n], S)$ converges to 0, so 

$$\exists n_2 \in \mathbb{N}, \forall n \geq n_2, \quad d_s(S[n], S) < 10^{-(k+2)};$$ 
after $n_2$, the first $k + 2$ terms of $S[n]$ and $S$ are equal. 

As a consequence, the first $k + 1$ entries of the strategies of $G_f(S[n], E[n])$ and $G_f(S, E)$ are the same 
(because $G_f$ shifts the strategies), and by the definition of $d_s$, the fractional part of the distance between 
$G_f(S[n], E[n])$ and $G_f(S, E)$ is strictly less than $10^{-(k+1)} \leq \varepsilon$. 

In conclusion, $G_f$ is continuous: 

$$\forall \varepsilon > 0, \exists N_0 = \max(n_0, n_1, n_2) \in \mathbb{N}, \forall n \geq N_0, \quad d(G_f(S[n], E[n]); G_f(S, E)) \leq \varepsilon.$$ 

In this section, we proved that chaotic iterations can be modeled as a dynamical system in a topological 
space. In the next section, we show that chaotic iterations are a case of topological chaos in the sense of 
Devaney. 

IV. Discrete chaotic iterations are topological chaos 

To prove that we are in the framework of Devaney's topological chaos, we will check the regularity and 
transitivity conditions. 

1 Regularity 

Theorem 3 Periodic points of $G_f$ are dense in $X$. 

PROOF Let $(S, E) \in X$ and $\varepsilon > 0$. We are looking for a periodic point $(S', E')$ satisfying 

$$d((S, E); (S', E')) < \varepsilon.$$ 

We choose $E' = E$, and we "copy" enough entries from $S$ to $S'$ so that the distance between $(S', E)$ and 
$(S, E)$ is strictly less than $\varepsilon$: a number $k = \lfloor -\log_{10}(\varepsilon) \rfloor + 1$ of terms is sufficient. 
After these $k$ iterations, the new common state is $\mathcal{E}$, and the strategy $S'$ is shifted by $k$ positions: $\sigma^k(S')$. 
Then we have to complete the strategy $S'$ in order to make $(S', E')$ periodic (at least for sufficiently large 
indices). To do so, we append infinitely many terms equal to 1 to the strategy $S'$. Then: 

1. Either the first state is conserved after one iteration, so $\mathcal{E}$ is unchanged and we obtain a fixed point; 

2. Or the first state is not conserved. Then: 

• If the first state is not conserved after a second iteration, we are again in the first case 
above (due to the negation function). 

• Otherwise the first state is conserved, and we indeed have a fixed (periodic) point. 
Thus, there exists a periodic point in every neighborhood of any point, so $(X, G_f)$ is regular. 

2 Transitivity 

Contrary to regularity, the topological transitivity condition is not automatically satisfied by every function 
($f$ = Identity is not topologically transitive). 

Let us denote by $\mathcal{T}$ the set of maps $f$ such that $(X, G_f)$ is topologically transitive. Then: 

Theorem 4 $\mathcal{T}$ is a nonempty set. 

PROOF We will prove that the vectorial logical negation function $f_0$, 

$$f_0 : \mathbb{B}^N \to \mathbb{B}^N, \quad (x_1, \ldots, x_N) \mapsto (\overline{x_1}, \ldots, \overline{x_N}), \qquad (3)$$ 

belongs to $\mathcal{T}$, i.e. that $(X, G_{f_0})$ is topologically transitive. 

Let $\mathcal{B}_A = \mathcal{B}(X_A, r_A)$ and $\mathcal{B}_B = \mathcal{B}(X_B, r_B)$ be two open balls of $X$, with $X_A = (S_A, E_A)$ and 
$X_B = (S_B, E_B)$. Our goal is to start from a point of $\mathcal{B}_A$ (i.e. a point close to $X_A$) and to arrive in 
$\mathcal{B}_B$ (at a point close to $X_B$). 

Since we have to be close to $X_A$, the starting state is $E_A$; it remains to determine the strategy $S$. We start by 
filling $S$ with the first $n_0$ terms of the strategy $S_A$ of $X_A$, with $n_0$ large enough that $(S, E_A) \in \mathcal{B}_A$. 

Let $E$ be the image of the state $E_A$ under the first $n_0$ terms of the strategy $S$. This new state $E$ 
differs from $E_B$ in a finite number of cells; we append these cells to our strategy $S$ (this adds $n_1$ integers to $S$). 
In short, starting from $(S, E_A)$, we are in the state $E_B$ after $n_0 + n_1$ iterations, and the strategy $S$ has been shifted by 
$n_0 + n_1$ terms (no term of $S$ remains). 

In order to be sufficiently close to $(S_B, E_B)$ (at a distance less than $r_B$ from $(S_B, E_B)$), we append as many 
terms of $S_B$ to $S$ as necessary, and we complete $S$ with infinitely many terms equal to 1. 
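
The first two stages of this construction can be sketched as follows for the negation function. The helper names `apply_negation` and `bridging_prefix` are hypothetical, cells are numbered 1..N as in the text, and only the finite part of the strategy (the copied terms of the first strategy followed by the cells to correct) is built; the tail made of terms of the second strategy is omitted.

```python
from typing import List, Sequence


def apply_negation(E: Sequence[int], cells: Sequence[int]) -> List[int]:
    """State reached from E when the listed cells (1..N) are negated in turn."""
    state = list(E)
    for c in cells:
        state[c - 1] ^= 1
    return state


def bridging_prefix(prefix_A: Sequence[int],
                    E_A: Sequence[int],
                    E_B: Sequence[int]) -> List[int]:
    """Copied terms of the first strategy, followed by the cells that still
    differ from the target state once those terms have been applied."""
    E = apply_negation(E_A, prefix_A)
    fix = [j for j, (a, b) in enumerate(zip(E, E_B), start=1) if a != b]
    return list(prefix_A) + fix


prefix = bridging_prefix([2, 2, 1], [0, 0, 0], [1, 1, 0])
print(prefix)                               # [2, 2, 1, 2]
print(apply_negation([0, 0, 0], prefix))    # [1, 1, 0]
```

After the prefix has been consumed, the state equals the target state, which is the key step of the argument.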

Remark 1 In fact, we can prove that $(X, G_{f_0})$ is highly topologically transitive in the following sense: for 
every $(S_A, E_A)$ and $(S_B, E_B)$ of $X$, there exist a point $(S, E)$ sufficiently close to $(S_A, E_A)$ and $n_0 \in \mathbb{N}$ such that 

$$G_{f_0}^{n_0}(S, E) = (S_B, E_B).$$ 

In conclusion, if $f \in \mathcal{T}$ (and $\mathcal{T} \neq \emptyset$), then $(X, G_f)$ is topologically transitive and regular, which gives the 
following result. 

Theorem 5 $\forall f \in \mathcal{T}$, $(X, G_f)$ is chaotic in the sense of Devaney. 



V. Hash functions based on topological chaos 

1 Objective 

As an application of the previous theory, we define in this section a new way to generate hash functions 
based on topological chaos. Our approach guarantees various properties desired in the domain of 
encryption. For example, the avalanche criterion is closely linked to the expansivity property (see the next 
section). 

The following hash function is based on the vectorial boolean negation $f_0$ defined in (3). Nevertheless, 
our procedure remains general and can be applied with any transitive function $f$. 

2 Application of the new hash function 

Our initial condition $X[0] = (S, E)$ is composed of: 

• A 256-bit sequence that we call $E$, obtained from the original text. 

• A chaotic strategy $S$. 

In the sequel, we describe how to obtain this initial condition $(S, E)$. 

2.1 How to obtain E 

The first step of our algorithm is to transform the message into a normalized 256-bit sequence $E$. To illustrate 
this step, we take an example. Our original text is: The original text 

Each character of this string is replaced by its ASCII code (on 7 bits). Then, we append a 1 to this string. 



10101001 10100011 00101010 00001101 11111100 10110100 
11100111 11010011 10111011 00001110 11000100 00011101 
00110010 11111000 11101001 



Then, we append the binary value of the length of this string, followed by one more 1: 



10101001 10100011 00101010 00001101 11111100 10110100 
11100111 11010011 10111011 00001110 11000100 00011101 
00110010 11111000 11101001 11110001 
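
These two steps can be sketched in Python. The function name `encode` is ours, and we assume that "the length of this string" is its length in bits, written in binary without padding; this reading reproduces the final octet 11110001 of the example (120 in binary, followed by a 1).

```python
def encode(text: str) -> str:
    """7-bit ASCII codes, then a 1, then the bit length in binary, then a 1."""
    if any(ord(c) > 127 for c in text):
        raise ValueError("only 7-bit ASCII input is handled in this sketch")
    bits = "".join(format(ord(c), "07b") for c in text) + "1"
    # Assumption: the length is counted in bits and written without padding.
    bits += format(len(bits), "b") + "1"
    return bits


bits = encode("The original text")
print(len(bits))  # 128
print(" ".join(bits[i:i + 8] for i in range(0, len(bits), 8)))
```

For the example text, the 17 characters give 119 bits, so the string has 120 bits after the first appended 1 and 128 bits at the end of this step.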
Then, the whole string is copied, but in the opposite direction. This gives: 



10101001 10100011 00101010 00001101 11111100 10110100 
11100111 11010011 10111011 00001110 11000100 00011101 
00110010 11111000 11101001 11110001 00011111 00101110 
00111110 10011001 01110000 01000110 11100001 10111011 
10010111 11001110 01011010 01111111 01100000 10101001 
10001011 0010101 
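
The text does not say exactly which bits are mirrored. The 255-bit result listed above is consistent with appending the reverse of all bits except the last one; the sketch below encodes that reading, which is our inference from the example rather than a stated rule. The name `mirror` is hypothetical.

```python
def mirror(bits: str) -> str:
    """Append the reversed string, leaving out its final bit (inferred rule)."""
    return bits + bits[:-1][::-1]


# The 128-bit string of the example, copied from the listing above.
example = "".join("""
10101001 10100011 00101010 00001101 11111100 10110100
11100111 11010011 10111011 00001110 11000100 00011101
00110010 11111000 11101001 11110001
""".split())

mirrored = mirror(example)
print(len(mirrored))        # 255
print(mirrored[128:144])    # 0001111100101110
```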



We then obtain a string whose length is a multiple of 512 by duplicating this string enough times and truncating it at a multiple of 512. 
This string, which contains the whole original text, is denoted by $D$. 

Finally, we split $D$ into blocks of 256 bits and apply the exclusive-or function to them, 
obtaining a 256-bit sequence. 



11111010 11100101 01111110 00010110 00000101 11011101 
00101000 01110100 11001101 00010011 01001100 00100111 
01010111 00001001 00111010 00010011 00100001 01110010 
01000011 10101011 10010000 11001011 00100010 11001100 
10111000 01010010 11101110 10000001 10100001 11111010 
10011101 01111101 
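
The duplication and folding steps can be sketched as follows. We assume that "a multiple of 512" means the smallest multiple of 512 that is at least the length of the string, so that the whole text is kept; the names `extend_to_512` and `xor_fold` are ours. With this reading, a 255-bit string is repeated three times and cut to 512 bits, and the first two octets obtained for the example (11111010 11100101) agree with the sequence listed above.

```python
def extend_to_512(bits: str) -> str:
    """Repeat the string and cut it at the smallest multiple of 512 >= len(bits)."""
    target = -(-len(bits) // 512) * 512      # ceiling to a multiple of 512
    copies = -(-target // len(bits))         # enough copies to reach the target
    return (bits * copies)[:target]


def xor_fold(d: str, block: int = 256) -> str:
    """Exclusive-or of all consecutive blocks of `block` bits."""
    acc = 0
    for i in range(0, len(d), block):
        acc ^= int(d[i:i + block], 2)
    return format(acc, "0{}b".format(block))


print(xor_fold("11001010", block=4))  # 0110
# The first 16 folded bits depend only on the first 17 bits of a 255-bit input,
# so a zero-padded stand-in for the example string is enough to check them.
stand_in = "10101001101000110" + "0" * 238
print(xor_fold(extend_to_512(stand_in))[:16])  # 1111101011100101
```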



So, in the context of subsection (1), $N = 256$, and $E$ is the sequence of 256 bits obtained above. 

We now have the definitive length of our digest. Note that many texts yield the same string $E$. This is not 
a problem, because the strategy we build depends on the whole text. Let us now build the strategy $S$. 



2.2 How to choose S 

In order to build our strategy, i.e. the sequence $S$ of $X[0] = (S, E)$, we use the previously obtained string 
$D$ and start by constructing an intermediate sequence as follows: 

1. We split this string into blocks of 8 bits, and we add to our sequence the decimal value 
of each octet. 

2. We then take the first bit of this string and move it to the end. We split the new string into blocks 
of 8 bits, and we add the associated decimal values to the sequence. 

3. We repeat this operation 6 times. 

The general term of this sequence will be denoted by $(u[n])_n$. 

We are now able to build our strategy $S$. The first term of $S$ is the initial term of the preceding sequence. 
The $n$-th term is the sum (modulo 256) of the three following terms: 

• the $n$-th term of the intermediate sequence (the strategy depends on the original text), 

• the double of the $(n-1)$-th term of the strategy (introduction of sensitivity, by analogy with the 
well-known chaotic map $\theta \mapsto 2\theta \pmod 1$), 

• $n$ (to prevent periodic behaviour). 

So, the general term $S[n]$ of $S$ is defined by 

$$S[n] = (u[n] + 2 \times S[n-1] + n) \pmod{256}.$$ 

The strategy $S$ is strongly sensitive to modifications of the original text, because the map $\theta \mapsto 2\theta \pmod 1$ is 
known to be chaotic in the sense of Devaney. 
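
The construction of the intermediate sequence and of the strategy can be sketched as follows. The number of rotations is our reading of steps 2 and 3 (one rotation, repeated 6 more times, hence 7 rotations and 8 passes) and is exposed as a parameter; the function names are hypothetical.

```python
from typing import List, Sequence


def intermediate_sequence(d: str, rotations: int = 7) -> List[int]:
    """Octet values of d, then of each successive one-bit left rotation of d.

    Assumption: 7 rotations in total; the text leaves the exact count open.
    """
    u: List[int] = []
    s = d
    for _ in range(rotations + 1):
        u.extend(int(s[i:i + 8], 2) for i in range(0, len(s), 8))
        s = s[1:] + s[:1]        # move the first bit to the end
    return u


def strategy(u: Sequence[int]) -> List[int]:
    """S[0] = u[0] and S[n] = (u[n] + 2 * S[n - 1] + n) mod 256."""
    S = [u[0]]
    for n in range(1, len(u)):
        S.append((u[n] + 2 * S[n - 1] + n) % 256)
    return S


print(intermediate_sequence("1000000000000001", rotations=1))  # [128, 1, 0, 3]
print(strategy([3, 5, 250]))                                   # [3, 12, 20]
```

With a 512-bit string and 8 passes, the intermediate sequence has 512 terms, and so does the strategy.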

2.3 How to construct the digest 

We apply the logical negation function to the $S[k]$-th term of $E$ (modulo 256). Indeed, the function $f$ of 
equation (3) is defined by 

$$f : \mathbb{B}^{256} \to \mathbb{B}^{256}, \quad (E[1], \ldots, E[256]) \mapsto (\overline{E[1]}, \ldots, \overline{E[256]}).$$ 

The logical negation may be applied several times to the same bit. 

Finally, we split these 256 bits into blocks of 4 bits, which yields the hexadecimal value: 

63A88CB6AF0B18E3BE828F9BDA4596A6A13DFE38440AB9557DA1C0C6B1EDBDBD 

For comparison, if instead of the text "The original text" we take "the original text", the 
hash function returns: 

33E0DFB5BB1D88C924D2AF80B14FF5A7B1A3DEF9D0E831194BD814C8A3B948B3 
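
The last stage can be sketched as follows. Two points are assumptions on our part: the cell selected by a term of the strategy is taken to be the 0-based index given by that term modulo the number of cells, and one iteration is performed per term of the strategy supplied (the text does not state how many iterations are run). The name `digest` is hypothetical, and this sketch is not claimed to reproduce the digests printed above.

```python
from typing import Sequence


def digest(E: str, S: Sequence[int]) -> str:
    """Negate the cell selected by each strategy term, then write E in hexadecimal."""
    cells = [int(b) for b in E]
    for s in S:
        # Assumption: cell index = strategy term modulo the number of cells.
        cells[s % len(cells)] ^= 1
    bits = "".join(str(b) for b in cells)
    # One hexadecimal digit per block of 4 bits.
    return "".join(format(int(bits[i:i + 4], 2), "X")
                   for i in range(0, len(bits), 4))


print(digest("00000000", [0, 7, 7, 9]))  # C0
```

In the example call, cell 7 is negated twice and is therefore unchanged, which illustrates the remark that the negation may hit the same bit several times.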

3 Example 

Consider the following message (a poem by E. A. Poe): 

Wanderers in that happy valley, 
Through two luminous windows, saw 
Spirits moving musically, 
To a lute's well-tuned law, 
Round about a throne where, sitting 
(Porphyrogene!) 
In state his glory well befitting, 
The ruler of the realm was seen. 

And all with pearl and ruby glowing 
Was the fair palace door, 
Through which came flowing, flowing, 
And sparkling evermore, 
A troop of Echoes, whose sweet duty 
Was but to sing, 
In voices of surpassing beauty, 
The wit and wisdom of their king. 

Our hash function returns: 

FF51DA4E7E50FBA7A8DC6858E9EC3353BDE2E465E1A6A1B03BEAA12A4AD694FB 

If we put an additional space before "Was the fair palace door," the hash function returns: 

3ABFA49B834D529669CFC1AEEC13E14EA5FFD2349582380BCBDBF8400017445 

If we replace "Echoes" by "echoes" in the original text, the hash function returns: 

FE54777C52D373B7AED2EA5ACAD422B5B563BB3B91E8FCB48AAE9331DAC54A9B 

VI. Quantitative measures 
1 General definitions 

In Section IV we proved that discrete chaotic iterations produce topological chaos by checking two 
qualitative properties, namely transitivity and regularity. This mathematical framework offers tools to measure 
this chaos quantitatively. 

The first of these measures is the constant of sensitivity defined in Definition 4. 

Intuitively, a function $f$ having a constant of sensitivity equal to $\delta$ means that there exist points arbitrarily 
close to any point $x$ which eventually separate from $x$ by at least $\delta$ under some iterations of $f$. 
This implies that an arbitrarily small error on the initial condition may be magnified upon iterations 
of $f$ (this is related to the famous butterfly effect). 
Other important tools are defined below. 

Definition 7 A function $f$ is said to have the property of expansivity if 

$$\exists \varepsilon > 0, \forall x \neq y, \exists n \in \mathbb{N}, \quad d(f^n(x), f^n(y)) > \varepsilon.$$ 
Then $\varepsilon$ is the constant of expansivity of $f$. We also say that $f$ is $\varepsilon$-expansive. 

Remark 2 A function $f$ has a constant of expansivity equal to $\varepsilon$ if an arbitrarily small error on any initial 
condition is amplified up to $\varepsilon$. 

There exist other important quantitative tools, such as topological entropy, which quantifies the 
information contained at each iteration, but they are beyond the scope of this paper. 

We will reconsider these quantitative measures in the next subsection, in relation to hash functions. 

2 Quantitative evaluation of our hash function 

Let $f_0$ be the vectorial logical negation previously used in our algorithm. In this section, the sensitivity and 
expansivity constants of $G_{f_0}$ will be calculated. 



2.1 Sensitivity 

We know that $(X, G_{f_0})$ has sensitive dependence on initial conditions. Moreover, we have the following 
result. 

Theorem 6 The constant of sensitivity of $(X, G_{f_0})$ is equal to $N$. 

Recall that $N = 256$ in our hash function. 

PROOF We have seen that sensitivity is a consequence of having Devaney's chaos property. Let us determine 
its constant. 

Let $(S, E)$ be a point of $X$, and $\delta > 0$. Then, let us define another point $(S', E')$ by: 

• $E' = E$, 

• The first $k$ terms of $S'$ are the same as those of $S$, where $k = \lfloor -\log_{10}(\delta) \rfloor + 1$, so that 
$d((S, E); (S', E')) < \delta$. 

• Then, we append the terms $1, 2, 3, \ldots, N$ to $S'$. 

• $S'$ can be completed by any terms. 

Thus a point $(S', E')$ can be found close to $(S, E)$ (with $d((S, E); (S', E')) < \delta$) such that the states of 
$G_{f_0}^{k+N}(S, E)$ and $G_{f_0}^{k+N}(S', E')$ differ in each cell, so that the distance between these two points is greater than or equal to $N$. 
This proves that we have sensitive dependence on the original text and that the constant of sensitivity is $N$. 



2.2 Expansivity 

Theorem 7 $(X, G_{f_0})$ is an expansive chaotic system. Its constant of expansivity is equal to 1. 
PROOF If $(S, E) \neq (\check{S}, \check{E})$, then: 

• Either $E \neq \check{E}$, and then at least one cell is not in the same state in $E$ and $\check{E}$. Then the distance 
between $(S, E)$ and $(\check{S}, \check{E})$ is greater than or equal to 1. 

• Or $E = \check{E}$. Then the strategies $S$ and $\check{S}$ are not equal. Let $n_0$ be the first index at which the terms of $S$ 
and $\check{S}$ differ. Then 

$$\forall k < n_0, \quad G_{f_0}^{k}(S, E) = G_{f_0}^{k}(\check{S}, \check{E}),$$ 

and $G_{f_0}^{n_0}(S, E) \neq G_{f_0}^{n_0}(\check{S}, \check{E})$. As $E = \check{E}$, the cell which has changed in $E$ at the $n_0$-th iterate is 
not the same as the cell which has changed in $\check{E}$, so the distance between $G_{f_0}^{n_0}(S, E)$ and $G_{f_0}^{n_0}(\check{S}, \check{E})$ 
is greater than or equal to 2. 

The property of expansivity is a kind of avalanche effect. 

Note that it can easily be proved that $(X, G_{f_0})$ is not $A$-expansive for any $A > 1$. 
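
As an informal, empirical counterpart to this avalanche effect, one can count the bits that differ between two digests. The sketch below does so for the two digests reported in Section V.2 for "The original text" and "the original text" (written here without internal spaces). The helper name `hamming_hex` is ours, and this count is only an illustration, not one of the constants defined above.

```python
def hamming_hex(a: str, b: str) -> int:
    """Number of differing bits between two hexadecimal digests of equal length."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


h_upper = "63A88CB6AF0B18E3BE828F9BDA4596A6A13DFE38440AB9557DA1C0C6B1EDBDBD"
h_lower = "33E0DFB5BB1D88C924D2AF80B14FF5A7B1A3DEF9D0E831194BD814C8A3B948B3"

# Number of bits, out of 256, that differ between the two digests.
print(hamming_hex(h_upper, h_lower))
```

A count close to half of the 256 bits is what one would expect from a digest that reacts strongly to a one-character change.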






VII. Discussion and future work 



We proved that discrete chaotic iterations are a particular case of Devaney's topological chaos if the iteration 
function is topologically transitive, and that the set of topologically transitive functions is nonempty. 
We applied our results to the generation of new hash functions. Even though we used the vectorial boolean 
negation function, our procedure remains general and other transitive functions can be used. 
By considering hash functions as an application of our theory, we have shown how some desirable aspects 
of encryption, such as unpredictability, sensitivity to initial conditions, mixture and disorder, can be 
mathematically guaranteed and even quantified by mathematical tools. 

The theory of chaos reminds us that simple functions can have, when iterated, a very complex behaviour, 
while some complicated functions can have predictable iterations. This is why it is important to have 
tools for evaluating desired properties. 

In our example, we used a simple topologically transitive iteration function, but it can be proved that 
many functions of this kind exist. Our simple function may be replaced by other "chaotic" functions, which 
can be evaluated with the quantitative tools described above. Another important parameter is the choice of 
the strategy S. We proposed a particular strategy that can easily be improved in multiple ways. 
We do not claim to have proposed a hash function that replaces well-known ones; we simply wished to show 
how our mathematical context makes it possible to build such functions and, especially, how important properties can 
be measured. 

Much work remains to be done. For example, we are convinced that a good understanding of the 
transitivity property makes it possible to study the problem of collisions in hash functions. 

In future work we plan to investigate other forms of chaos such as Li-Yorke chaos [?], to explore other 
quantitative and qualitative tools such as entropy (see e.g. [?]), and to enlarge the domain of applications of 
our theoretical concepts. 